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Abstract — A new algorithm, termed subspace evolution and 
transfer (SET), is proposed for solving the consistent matrix 
completion problem. In this setting, one is given a subset of 
the entries of a low-rank matrix, and asked to find one low-rank 
matrix consistent with the given observations. We show that this 
problem can be solved by searching for a column space that 
matches the observations. The corresponding algorithm consists 
of two parts — subspace evolution and subspace transfer. In 
the evolution part, we use a line search procedure to refine the 
column space. However, line search is not guaranteed to converge, 
as there may exist barriers along the search path that prevent 
the algorithm from reaching a global optimum. To address this 
problem, in the transfer part, we design mechanisms to detect 
barriers and transfer the estimated column space from one side of 
the barrier to the another. The SET algorithm exhibits excellent 
empirical performance for very low-rank matrices. 

Index Terms — Matrix completion, subspace. 

I. Introduction 

Suppose that we observe a subset of entries of a matrix. The 
matrix completion problem asks when and how the matrix can 
be uniquely recovered based on the observed entries. This re- 
construction task is ill-posed and computationally intractable. 
■ However, if the data matrix is known to have low-rank, exact 
recovery can be accomplished in efficient manners, provided 
that sufficiently many entries are revealed. Low-rank matrix 
completion problem has received considerable interests due to 
its wide applications, see for example |5| for more details. 

An efficient way to solve the completion problem is via 
' convex relaxation. Instead of looking at rank-restricted ma- 
trices, one can search for the matrix with minimum nuclear 
norm, subject to data consistency constraints. Although in 
general nuclear norm minimization is not equivalent to rank 
minimization, the former approach recovers the same solution 
as the latter if the data matrix satisfies certain incoherence 
conditions [6|. More importantly, nuclear norm minimization 
can be accomplished by polynomial complexity algorithms, 
for example, semi-definite programming or singular value 
thresholding (SVT) H]. 

There are other low-complexity alternatives. Based on the 
subspace pursuit (SP) and CoSaMP algorithms for compres- 
sive sensing Q, IH, the authors of [21 developed the so called 
ADMiRA algorithm. A modification of the power factorization 
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algorithm was used for matrix completion in lO. Another 
approach for solving this problem, termed OptSpace, was 
described in |4|. 

The problem considered in this paper and its algorithmic 
solution differ from all previously published approaches. The 
problem at hand is to identify one low-rank matrix consistent 
with the observations. The solution may or may not be unique. 
In contrast, most results in matrix completion deal with the 
somewhat more restrictive requirement that the reconstruction 
is unique. Hence, our approach can be applied to scenarios 
where the matrix is highly under-sampled, and where po- 
tentially many consistent solutions exist. The relaxation on 
uniqueness allows for the empirically observed performance 
improvement over other completion techniques. 

To solve the consistent matrix completion problem, we 
propose an algorithm, termed subspace evolution and transfer 
(SET). We show that the matrix completion problem can 
be solved by searching for a column (or row) space that 
matches the observations. As a result, optimization on the 
Grassmann manifold, i.e., subspace evolution, plays a central 
role in the algorithm. However, there may exist "barriers" 
along the search path that prevent subspace evolution from 
converging to a global optimum. To address this problem, in 
the subspace transfer part, we design mechanisms to detect and 
cross barriers. Empirical simulations demonstrate the excellent 
performance of the proposed algorithm. 

Despite resembling the OptSpace algorithm ^ in terms of 
using optimization over Grassmann manifolds, our approach 
substantially differs from this algorithm. Searching over only 
one space (column or row space) represents one of the most 
significant differences: in OptSpace, one searches both column 
and row spaces simultaneously, which introduces numerical 
and analytical difficulties. Moreover, when optimizing over the 
column space, one has to take care of "barriers" that prevent 
the search procedure from converging to a global optimum, 
an issue that was not addressed before since it was obscured 
by simultaneous column and row space searches. 

II. Consistent Matrix Completion 
Let X E jjmxn unknown matrix with rank r ^ 

min(TO,n), and let C [m] x [n] be the set of indices of 
the observed entries, where [K] = {1, 2, • • • , A'}. Define the 
projection operator : M'"^^" ^ M"^" by 



The consistent matrix completion problem is to find one rank-r 
matrix X' that is consistent with the observations X^^i, i.e., 

(PO) : find X' such that 

rank {X') < r and (X') = (X) = Xn- (1) 

This problem is well defined as is generated from the 
matrix X with rank r and therefore there must exist at least 
one solution. In this paper, like in other approaches in Q-HI, 
we assume that the rank r is given. In practice, one may try 
to sequentially guess a rank bound until a satisfactory solution 
has been found. 

III. The set Algorithm 

A. Why optimize over column spaces only? 

In this section, we show that the problem (PO) is equivalent 
to finding a column space consistent with the observations. 

Let Um.r be the set of m x r matrices with r orthonormal 
columns, i.e., Um,r ^ {U e M'"'''^ : U'^U = /,.}. Define a 
function 

min \\Xn-^n{UW^)\\l, (2) 

where || ||^ denotes the Frobenius norm. The function / 
captures the consistency between the matrix U and the obser- 
vations Xji: if / (J7) = 0, then there exists a matrix W such 
that the rank-r matrix UW^ satisfies *Po (UW^) = Xq. 
Hence, the consistent matrix completion problem is equivalent 
to 

(PI) : find J7 e Um.r such that / (U) = 0. (3) 

The solution f (U) is not unique in the space Um,r- An 
important property of / is that / (U) = f {UV) for any r- 
by-r orthogonal matrix V, since UW"^ = {UV) {WV)'^ . 
Hence, the function / depends only on the subspace spanned 
by the columns of U, i.e., the span {U). Note that all columns 
of the matrix of the form UW'^ lie in the linear subspace 
span(J7). The consistent matrix completion problem is es- 
sentially finding a column space consistent with the observed 
entries. 

We find the following definitions useful for the exposition 
to follow. The set of all r-dimensional linear subspaces in M" 
is called the Grassmann manifold, and is denoted by Qm^,-- 
Given a subspace ^ € Gm,r, one can always find a matrix 
U G Um,r, such that ^ = span {U). The matrix U is referred 
to as a generator matrix of Although a given subspace 
G Gm.r has multiple generator matrices, a given matrix 
U E Um.r uniquely defines a subspace. For this reason, we 
henceforth use U to represent its generated subspace. 

B. The SET algorithm: a high level description 

Our algorithm aims to minimize the objective function 
f {U), provided that the minimum value of f (U) is known 
to be zero. Ideally, a solution can be obtained by using 
a line search procedure on the Grassmann manifold. Here, 
line search refers to iterative refinements of the interval in 



which the function attains its minimum. Hence, the "subspace 
evolution" part of the algorithm reduces to a well studied 
optimization method. 

The main difficulty that arises during line search, and makes 
the SET algorithm highly non-trivial is when during the search, 
one encounters "barriers". Careful inspection reveals that the 
objective function / can be decomposed into a sum of atomic 
functions, each of which involves only one column of X^ 
(see Section IIII-DI for details). Along the gradient descent 
path, these atomic functions may not agree with each other: 
some decrease and some increase. Increases of some atomic 
functions may result in "bumps" in the / curve, which block 
the search procedure from global optima and are therefore 
referred to as barriers. The main component of the "transfer" 
part of the algorithm is to identify whether there exist barriers 
along the gradient descent path. Detecting barriers is in general 
a very difficult task, since one does not know the locations of 
global minima. Nevertheless, we observe that barriers can be 
detected by the existence of atomic functions with inconsistent 
descent directions. When such a scenario is encountered, the 
algorithm "transfers" the starting point of line search to the 
other side of the barriers, and proceeds from there. Such 
a transfer does not overshoot global minima as we enforce 
consistency of the steep descent directions at the points before 
and after the transfer 

In summary, we start with a randomly generated U E Um.r 
and then refine it until f [U] — i). At each iteration, we first 
detect and then cross barriers if there are any, and then perform 
line search. The details of subspace evolution and transfer 
are given in Section IIII-CI and IIII-DI Simulation results are 
presented in Section HVl 

C. Subspace evolution 

Due to space limitation, we focus on the r ~ 1 case in 
Sections IIII-CI and IIII-DI Furthermore, our exposition aims 
to make the algorithmic details as transparent to the readers 
as possible. The highly technical performance and complexity 
analysis of SET for both r — 1 and r > 1 is deferred to the 
journal version of the paper 

For the optimization problem at hand, we shall refine 
the current column space estimate u following the gradient 
descent direction. Here, the lowercase letter u is used to 
emphasize that the U matrix is a vector when r = 1. Let 
Wu be a length-n column vector that achieves / (u), and let 
Xr = Xq — {uw^) . Then the gradient of / at it is given 
by 

V„/ = -2XrW^. (4) 

The gradient descent path is chosen to be the geodesic 
curve on the Grassmann manifold with direction h = 
II V„/||^. A geodesic surve is an analogue of a 
straight line in an Euclidean space: given two points on the 
manifold, the geodesic curve connecting them is the path of 
the shortest length on the manifold. According to [|9] Theorem 
2.3], the geodesic curve starting from u, along h, is given by 

u (t) — ucost + hsint, t E [0,tt) . (5) 

'The gradient is well defined almost everywhere in Um.r- 



We restrict t to the interval [0, tt) because / {u (t)) has period 
TT, i.e., f{u{t + n)) = f{-u{t)) = f{u{t)). Interested 
readers are referred to |i9J for more details on geodesies on 
the Grassmann manifold. 

The subspace evolution part is designed to search for a 
minimizer (in most cases, a local minimizer) of the function 
/ along the geodesic curve. Our implementation includes two 
steps. The goal of the first step is to identify an interval 
[0,tmax] that contains a minimizer. Since f (t) is periodic, 
imax is upper bounded by tt. The second step is devoted to 
locating the minimizer t* £ [0, tmax] accurately by iteratively 
applying the golden section rule |10|. These two steps are 
described in Algorithm [T] The constants are set to e = 10^^, 
ci = (V5 - l) /2, C2 ^ ci/ (1 - ci) and UN ^ 10. Ideally, 
the starting step size e > should be chosen as small as 
possible. We fix it to a constant as computers only have finite 
precision and 10~^ is already sufficiently small in all our 
experiments. 

Algorithm 1 Subspace evolution. 
Input: Xii, Q, u, and UN. 
Output: t* and u{t*). 

Step A: find tmax < ti" such that t* e [0,tmax] 
Let t' = en. 

1) Let t" = C2-t'.lf t" > TT, then imax = tt and quit Step 
A. 

2) If / {u (t")) > f{u (t)), then t^ax = t" and quit Step 
A. 

3) Otherwise, t' = t". Go back to step 1). 
Step B: numerically search for t* in [0,iinax]- 

Let tl = imax/c2, ^2 = imax/c2, ^4 = ^max, and ts = ti + 

c\ (i4 — Let Un ~ \. Perform the following iterations. 

1) If > !{u[t2)) > !{u(iz)l then ti = ts, 
t2 = tz, and t3 = tl + ci (t4 - 

2) Else, = ia, ts = and t2 = + (1 - ci) (t4 - ti). 

3) itn = Un + 1. If ztn > UN , then quit the iterations. 
Otherwise, go back to step 1). 

Let i* — argmin / (it (<)) and compute u (t*). 
te{ti,---:«4} 



D. Subspace transfer 

Unfortunately, the objective function / {u) may not be a 
convex function of u. The described linear search procedure 
may not converge to a global minimum because the search 
path may be blocked by what we call "barriers". We show 
next how to overcome the problem introduced by barriers. 

At this point, we formally introduce the decoupling princi- 
ple: the objection function / {u (t)) is the squared Frobenius 
norm of the residue matrix; it can be decomposed as the sum 
of the squared Frobenius norm of the residue columns. More 
precisely, let x^. e M™^^ be the column of the matrix 
Xn. Let : 'r"'''^ ^ M™''^ be the projection operator 
corresponding to the j*'' column, defined by 
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(a) Contours of /i . (b) Search paths with zooming in. 

Figure 1 : An illustrative example for barriers. 



Then the objective function / (it (t)) can be written as a sum 
of n atomic functions: 

/ (it {t) ) = min JXn-'^n {uw'^) \ \ ^ 
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This principle is essential to understanding the behavior of 

The following example illustrates the concept of a barrier. 
Consider an incomplete observation of a rank-one matrix 
[?, 2, 2]"^ , [2, ?, 1]"'" , where question marks denote that the 
corresponding entries are unknown. It is clear that the objective 
function / (it (t)) is minimized by ux — [2, 1, 1] , i.e., 
/ {ux ) = 0. Suppose that one starts with the initial guess 
It = -^^= [—10, 1, 1]^. The contours of the atomic function 
/i (it), projected on the plane spanned by it2 and its, is 
depicted in Fig. [Ta] All it's with 112 = — 113 lie on the 
contour /i (it) = 8. Computations show that the gradient 
descent direction h is pointing upward. However / (it) = 
/i (it) + /2 (it) = + /2 (it) < 5 < 8. Any gradient descent 
algorithms can not pass through the contour /i (it) = 8. 
Careful tracking of several line search steps (Fig. [Tb] ) shows 
that It (<) will approach [—1,0,0]^, but will never cross the 
contour /i = 8. That is, the contour /i = 8 forms a "barrier" 
for the line search procedure. 

It is possible to detect barriers algorithmic ally. It can be ver- 
ified that fj (it (t)) has a unique minimizer and maximize^ 
given by 



imaxj = argmax/j (it (<)) and tnun.j 

te[0,7r) 



argmin/j (it {t)) 

te[0,7r) 

(7) 

respectively. There are closed-form equations for these two 
quantities given an initial vector it and a direction h. We say 
that the fc*'* column of Xn forms a barrier if there exists a 
j G [n], j ^ k, such that 

1) the maximizer of fk appears before the minimizer of fj, 

i.e., ^max./i: ^ ^minj ^ ^maxj? and 

2) the gradients of / at m (0) and it (tmax.fc) are consistent 

A 

dt ' 



(form a shai-p angle), i.e., ^/ (it (t)) |t=t,„^^ ^ < 0. 



V vq_. , where (vq. 



^The exception is that '^n^ (u) and CPn^. (h) are linearly dependent, which 
happens with zero probability and is ignored here for simplicity. 



When a barrier is detected, we transfer u from one side of 
it to the other In our implementation, we focus on the closest 
barriers to u to avoid overshooting. Define 

~ {j '■ the J*'' column of X^^ admits barriers} , 

j* =argmintpj, and (8) 

k* = argmax {to.k '■ the fc*'' column of forms a barrier 

k 

1 

for the j column of Xq_ > . (9) 
The subspace transfer part is described in Algorithm |2] 

Algorithm 2 Subspace transfer 
Input: Xq, Vl, and u. 
Output: tst and u {tst)- 
Steps: 

1) Compute j and tpj for each column j satisfying 
rank ( [itn, , J ) = 2. 

2) Suppose that there exist barriers. 

a) find j* and k* according to ^ and (|9|l respectively. 

b) Let tst — to.k' and compute u {tst)- 

3) Otherwise, tst — and u {tst) = u. 



IV. Performance Evaluation 

Here, we introduce an error tolerance parameter > 0. In 
practice, instead of requiring exact data matching, it usually 
suffices to have \\'^n{X') - Xn\fp < ee||-^o||^ for some 
small eg. In our simulations, we set eg — 10^^. 

We tested the SET algorithm by randomly generating 
low-rank matrices X and index sets ft. Specifically, we 
decompose the matrix X into X = UxSxV^, where 



Ux e U„ 



Vx e U„,.r, and Sx e 



We generate 



Ux and Vx from the isotropic distribution on the set U„i^r 
and Un,r, respectively. The entries of the Sx matrix are 
independently drawn from the standard Gaussian distribution 
JV {0,1). This step is important in order to guarantee the 
randomness in the singular values of X. The index set is 
randomly generated from the uniform distribution over the set 
{fl' C [m] X [n] : \n'\ = \n\}. 

The performance of the SET algorithm is excellent. We 
tested different matrices with different ranks and different 
sampling rates, defined as |f2| / (to x n). The performance is 
shown in Fig. |2] The performance improvement due to the 
transfer step is significant. We also compare the SET algorithm 
to other matrix completion algorithmo As shown in Figure 
m the SET algorithm outperforms all other tested completion 
approaches. For most realizations, the SET algorithm needs 
less than 500 iterations to converge. However, there are 
examples for which the reconstruction error is still large after 
2000 iterations. Studying such realizations indicates that the 

^Though the SVT algorithm is not designed to solve the problem (PO), we 
include it for completeness. In the standard SVT algorithm, there is no explicit 
constraint on the rank of the reconstructed matrix. For fair comparison, we 
take the best rank-r approximation of the reconstructed matrix, and check 
whether it satisfies the performance criterion. 



major reason for this phenomena is a slow convergence rate: 
after many iterations the barriers are still too far away from 
space U to be detectable. One future research direction is 
therefore to speed up the SET algorithm. As a final remark, 
we notice that there exist a critical range of sampling rates, in 
which the performance deteriorates. As can be observed from 
the figures, this range shifts to the right as the rank increases. 
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Figure 2: Performance of the SET algorithm. 
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Figure 3: Performance comparison. 
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